Effort-aware just-in-time defect identification in practice: A case study at Alibaba
National Research Foundation (NRF) Singapore under its AI Singapore Programm
A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms
Abstract syntax tree (AST) mapping algorithms are widely used to analyze
changes in source code. Despite the foundational role of AST mapping
algorithms, little effort has been made to evaluate the accuracy of AST mapping
algorithms, i.e., the extent to which an algorithm captures the evolution of
code. We observe that a program element often has only one best-mapped program
element. Based on this observation, we propose a hierarchical approach to
automatically compare the similarity of mapped statements and tokens by
different algorithms. By performing the comparison, we determine if each of the
compared algorithms generates inaccurate mappings for a statement or its
tokens. We invite 12 external experts to determine if three commonly used AST
mapping algorithms generate accurate mappings for a statement and its tokens
for 200 statements. Based on the experts' feedback, we observe that our approach
achieves a precision of 0.98--1.00 and a recall of 0.65--0.75. Furthermore, we
conduct a large-scale study with a dataset of ten Java projects, containing a
total of 263,165 file revisions. Our approach determines that GumTree, MTDiff
and IJM generate inaccurate mappings for 20%--29%, 25%--36% and 21%--30% of the
file revisions, respectively. Our experimental results show that state-of-the-art
AST mapping algorithms still need improvement.
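The differential idea in this abstract can be illustrated with a minimal sketch. It is not the authors' hierarchical approach: it assumes each algorithm reports a single mapped target statement as text, uses `difflib.SequenceMatcher` as a stand-in similarity measure, and flags any algorithm whose mapping is less similar to the source statement than the best mapping found by another algorithm. The function names and the algorithm labels `A`, `B`, `C` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Textual similarity in [0, 1]; a stand-in for a real statement/token measure."""
    return SequenceMatcher(None, a, b).ratio()


def flag_inaccurate(source_stmt: str, mapped_by_algo: dict) -> set:
    """Return the algorithms whose mapped statement is strictly less similar
    to source_stmt than the best mapping produced by any algorithm.

    mapped_by_algo maps a (hypothetical) algorithm name to the text of the
    statement it mapped source_stmt to, or None if it produced no mapping.
    """
    scores = {
        algo: similarity(source_stmt, target) if target is not None else 0.0
        for algo, target in mapped_by_algo.items()
    }
    best = max(scores.values())
    return {algo for algo, score in scores.items() if score < best}


# Illustrative input only: three algorithms map the same source statement.
flagged = flag_inaccurate(
    "int x = a + b;",
    {
        "A": "int x = a + b + c;",
        "B": "return y;",
        "C": "int x = a + b + c;",
    },
)
```

Because the comparison only uses disagreement between algorithms, it cannot flag a statement that every algorithm maps equally badly, which is consistent with the abstract reporting high precision but lower recall.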
JITO: A tool for just-in-time defect identification and localization
Australian Research Counci
What makes a popular academic AI repository?
Many AI researchers are publishing code, data and other resources that
accompany their papers in GitHub repositories. In this paper, we refer to these
repositories as academic AI repositories. Our preliminary study shows that
highly cited papers are more likely to have popular academic AI repositories
(and vice versa). Hence, we conduct an empirical study of
academic AI repositories to highlight good software engineering practices of
popular academic AI repositories for AI researchers.
We collect 1,149 academic AI repositories and label the 20% of repositories
with the most stars as popular and the bottom 70% as unpopular. The remaining
10% of repositories serve as
a gap between popular and unpopular academic AI repositories. We propose 21
features to characterize the software engineering practices of academic AI
repositories. Our experimental results show that popular and unpopular academic
AI repositories are statistically significantly different in 11 of the studied
features---indicating that the two groups of repositories have significantly
different software engineering practices. Furthermore, we find that the number
of links to other GitHub repositories in the README file, the number of images
in the README file and the inclusion of a license are the most important
features for differentiating the two groups of academic AI repositories. Our
dataset and code are made publicly available to share with the community.
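The labeling scheme described in this abstract (top 20% by stars as popular, bottom 70% as unpopular, the middle 10% left as a gap) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the integer-percentage parameters, the floor rounding of group sizes, and the handling of ties by sort order are all assumptions.

```python
def label_by_stars(stars, top_pct=20, bottom_pct=70):
    """Label each repository as 'popular', 'unpopular', or 'gap' by star count.

    stars is a list of star counts, one per repository. Group sizes use
    floor rounding (an assumption; the abstract does not specify rounding).
    """
    n = len(stars)
    # Indices of repositories ordered from most to fewest stars.
    order = sorted(range(n), key=lambda i: stars[i], reverse=True)
    n_top = n * top_pct // 100
    n_bottom = n * bottom_pct // 100

    labels = ["gap"] * n
    for i in order[:n_top]:
        labels[i] = "popular"
    for i in order[n - n_bottom:]:
        labels[i] = "unpopular"
    return labels


# Illustrative star counts only, not data from the study.
example_stars = [100, 5, 50, 1, 30, 2, 80, 3, 10, 4]
example_labels = label_by_stars(example_stars)
```

Leaving a gap between the two groups keeps repositories with nearly identical star counts from landing on opposite sides of a single cutoff.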
A differential testing approach for evaluating abstract syntax tree mapping algorithms
IAF-PP; Industry Alignment Fund; Key Technology Research and Development Program of Shandon